    Sublinear algorithms for local graph centrality estimation

    We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, which we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $1-\delta$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(m^{2/3}\Delta^{1/3}d^{-2/3},\, m^{4/5}d^{-3/5}))$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting for readability $\operatorname{poly}(\epsilon^{-1})$ and $\operatorname{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega(\min(m^{1/2}\Delta^{1/2}d^{-1/2},\, m^{2/3}d^{-1/3}))$ for both query complexity and computational complexity. Moreover, our technique yields an $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of [Brautbar et al., 2010], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node.
    Comment: 29 pages, 1 figure
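    The paper's low-variance estimator is not spelled out in the abstract, but a minimal Monte Carlo baseline illustrates the quantities involved: the PageRank vector with restart probability $\alpha$ is the endpoint distribution of a random walk that starts at a uniform node and stops with probability $\alpha$ at each step. The sketch below (the `out_neighbors` interface and the dangling-node convention are our assumptions, not the paper's) estimates the score of one target node; it needs on the order of $1/(\epsilon^2 \pi(v))$ walks for a $(1\pm\epsilon)$-approximation, which is exactly the variance cost the paper's local technique improves on.

```python
import random

def pagerank_of_node(out_neighbors, target, n, alpha=0.15,
                     walks=100_000, rng=None):
    """Naive Monte Carlo estimate of the PageRank score of `target`.

    PageRank equals the endpoint distribution of a random walk that
    starts at a uniform node and terminates with probability `alpha`
    at each step, so the fraction of walks ending at `target` is an
    unbiased estimator of its score.
    """
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(walks):
        u = rng.randrange(n)              # uniform random start node
        while rng.random() >= alpha:      # continue with prob. 1 - alpha
            nbrs = out_neighbors(u)
            if not nbrs:                  # dangling node: teleport
                u = rng.randrange(n)      # (one common convention)
            else:
                u = rng.choice(nbrs)
        hits += (u == target)
    return hits / walks
```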

    Faster Subgraph Counting in Sparse Graphs

    A fundamental graph problem asks to compute the number of induced copies of a $k$-node pattern graph $H$ in an $n$-node graph $G$. The fastest algorithm to date is still the 35-year-old algorithm by Nesetril and Poljak [Nesetril and Poljak, 1985], with running time $f(k) \cdot O(n^{\omega \lfloor k/3 \rfloor + 2})$ where $\omega \le 2.373$ is the matrix multiplication exponent. In this work we show that, if one takes into account the degeneracy $d$ of $G$, then the picture becomes substantially richer and leads to faster algorithms when $G$ is sufficiently sparse. More precisely, after introducing a novel notion of graph width, the DAG-treewidth, we prove what follows. If $H$ has DAG-treewidth $\tau(H)$ and $G$ has degeneracy $d$, then the induced copies of $H$ in $G$ can be counted in time $f(d,k) \cdot \tilde{O}(n^{\tau(H)})$; and, under the Exponential Time Hypothesis, no algorithm can solve the problem in time $f(d,k) \cdot n^{o(\tau(H)/\ln \tau(H))}$ for all $H$. This result characterises the complexity of counting subgraphs in a $d$-degenerate graph. Developing bounds on $\tau(H)$, then, we obtain natural generalisations of classic results and faster algorithms for sparse graphs. For example, when $d = O(\operatorname{polylog}(n))$ we can count the induced copies of any $H$ in time $f(k) \cdot \tilde{O}(n^{\lfloor k/4 \rfloor + 2})$, beating the Nesetril-Poljak algorithm by essentially a cubic factor in $n$.
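    The running times above are parameterised by the degeneracy $d$ of $G$: the smallest $d$ such that every subgraph has a vertex of degree at most $d$. As a point of reference, here is a minimal sketch of the standard peeling procedure that computes $d$ together with a degeneracy ordering; the bucket bookkeeping below favours readability over the classic linear-time implementation.

```python
def degeneracy_ordering(adj):
    """Compute the degeneracy d of an undirected graph and an ordering
    in which every vertex has at most d neighbors appearing later,
    by repeatedly peeling a minimum-degree vertex.

    adj: dict mapping each vertex to the set of its neighbors.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    buckets = {}                      # buckets[k] = vertices of degree k
    for v, k in deg.items():
        buckets.setdefault(k, set()).add(v)
    removed, order, d = set(), [], 0
    for _ in range(len(adj)):
        k = min(b for b, s in buckets.items() if s)   # min remaining degree
        v = buckets[k].pop()
        d = max(d, k)
        order.append(v)
        removed.add(v)
        for w in adj[v]:              # update degrees of live neighbors
            if w not in removed:
                buckets[deg[w]].discard(w)
                deg[w] -= 1
                buckets.setdefault(deg[w], set()).add(w)
    return d, order
```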

    Correlation Clustering with Adaptive Similarity Queries

    In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretic bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterisation of the trade-off between queries and clustering error.
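    The paper's algorithms are not spelled out in the abstract; as a rough illustration of the query pattern, here is a sketch of pivot-based clustering in the style of KwikCluster, where every similarity is learned through a query: pick a random unclustered object as pivot, query it against all remaining objects, and cluster the positives with it. The interface (`objects`, `query`) is hypothetical, and the paper's trade-off comes from additionally capping the number of queries, which this sketch does not do.

```python
import random

def pivot_cluster(objects, query, rng=None):
    """Pivot-based correlation clustering driven by similarity queries.

    query(u, v) -> bool reveals the binary similarity of a pair; each
    call counts as one query (O(n) queries per pivot round).
    Returns a list of clusters (lists of objects).
    """
    rng = rng or random.Random(0)
    remaining = list(objects)
    clusters = []
    while remaining:
        pivot = remaining.pop(rng.randrange(len(remaining)))
        cluster, rest = [pivot], []
        for u in remaining:
            # One query per (pivot, u) pair decides u's side.
            (cluster if query(pivot, u) else rest).append(u)
        clusters.append(cluster)
        remaining = rest
    return clusters
```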

    Motif counting beyond five nodes

    Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural algorithms based on Monte Carlo sampling of Markov chains (MC), and reported encouraging results. We show, perhaps surprisingly, that such algorithms are outperformed by color coding (CC) [2], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC; furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price, however. While MC is very efficient in terms of space, CC’s memory requirements become demanding when the size of the input graph and that of the graphlets grow. And yet, our experiments show that CC can push the limits of the state of the art, both in terms of the size of the input graph and of that of the graphlets.
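    To make the CC side concrete, below is a minimal sketch of one round of color coding for the simplest graphlet family, simple paths on $k \ge 2$ nodes (the paper extends the technique to general graphlet sampling, which this sketch does not cover). Vertices are colored uniformly at random with $k$ colors, and "colorful" paths, those using all $k$ colors, are counted by dynamic programming over (endpoint, used-color-subset) pairs.

```python
import random

def colorful_path_count(adj, k, rng=None):
    """One round of color coding: color vertices with k colors at random,
    then count colorful simple paths on k >= 2 vertices by DP over
    (endpoint, color-subset) states. adj: dict vertex -> set of neighbors.
    """
    rng = rng or random.Random(0)
    color = {v: rng.randrange(k) for v in adj}
    # paths[(v, S)] = number of colorful paths ending at v with color set S
    paths = {(v, 1 << color[v]): 1 for v in adj}
    for _ in range(k - 1):            # extend paths one vertex at a time
        nxt = {}
        for (v, S), cnt in paths.items():
            for w in adj[v]:
                c = 1 << color[w]
                if not S & c:         # only extend with an unused color
                    key = (w, S | c)
                    nxt[key] = nxt.get(key, 0) + cnt
        paths = nxt
    # Each undirected path is found once from each endpoint.
    return sum(paths.values()) // 2
```

    Since a fixed $k$-node path is colorful with probability $k!/k^k$, averaging this count over independent colorings and rescaling by $k^k/k!$ gives an unbiased estimate of the total number of $k$-node paths.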

    The Limits of Popularity-Based Recommendations, and the Role of Social Ties

    In this paper we introduce a mathematical model that captures some of the salient features of recommender systems that are based on popularity and that try to exploit social ties among the users. We show that, under very general conditions, the market always converges to a steady state, for which we are able to give an explicit form. Thanks to this we can tell rather precisely how much a market is altered by a recommendation system, and determine the power of users to influence others. Our theoretical results are complemented by experiments with real-world social networks showing that social graphs prevent large market distortions in spite of the presence of highly influential users.
    Comment: 10 pages, 9 figures, KDD 201
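    The exact model is not given in the abstract; as a loose illustration of the popularity-feedback dynamics it studies, here is a toy simulation (all names and parameters hypothetical, and the social-tie component of the paper's model omitted) in which each user follows the current popularity distribution with some probability and picks uniformly otherwise, with market shares settling toward a steady state over rounds.

```python
import random

def simulate_market(num_items, num_users, rounds=200,
                    follow_prob=0.5, rng=None):
    """Toy popularity-feedback market: each round, every user picks an
    item proportionally to current popularity with prob. `follow_prob`,
    and uniformly at random otherwise. Returns the final market shares.
    """
    rng = rng or random.Random(0)
    counts = [1] * num_items                  # smoothed popularity counts
    for _ in range(rounds):
        new = [1] * num_items                 # smoothing keeps items alive
        for _ in range(num_users):
            if rng.random() < follow_prob:    # popularity-biased choice
                i = rng.choices(range(num_items), weights=counts)[0]
            else:                             # intrinsic/uniform choice
                i = rng.randrange(num_items)
            new[i] += 1
        counts = new
    total = sum(counts)
    return [c / total for c in counts]
```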